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ABSTRACT 


This  exposition  presents  a state-of-the-art  survey  of  models  and 
algorithms  for  the  convex  cost  network  flow  problem. 
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I.  INTRODUCTION 


The  convex  C04-t  neXlW^Ji  ^Zou!  p^LobZzm  may  be  easily  described  in  terms 
of  a distribution  problem  over  a directed  graph  [V,  E],  where  V is  the  node 
set  and  E is  the  arc  set.  The  capacity  of  arc  j is  given  by  , and  the 
vector  of  all  capacities  is  denoted  by  u.  The  decision  variable  Xj  denotes 
the  flow  in  arc  j and  the  vector  of  all  flows  is  denoted  by  x.  The  convex 
cost  function  associated  with  the  flow  in  [V,  E]  is  given  by  g(x).  Math- 
ematically, the  convex  cost  network  flow  problem  may  be  stated  as  follows: 


min  g(x) 

(1) 

• 

ft 

• 

N 

(2) 

0 ^ X ^ u. 

(3) 

where  A is  a node-arc  incidence  matrix  for  [V,  E],  and  r is  the  vector  of 
requirements.  If  r^  > 0,  then  node  i is  a supply  node  with  supply  equal 
r^.  If  r^  < 0,  then  node  i is  a demand  node  with  demand  equal  Nodes 


having  r^  “ 0 are  transhipment  nodes.  We  assume  that  r^ 


r^  is  zero. 


in  which  case  total  supply  equals  total  demand.  If  r^  < 0,  no  feasible 
solution  exists.  If  r^  > 0,  we  may  place  the  problem  in  the  prescribed 
form  by  adding  a dtnmny  demand  node  having  demand  r^  and  extra  arcs  from 
each  supply  node  to  the  dummy  node.  In  this  case  g(x)  remains  a function 
of  the  original  arc  flows  only. 

The  convex  cost  network  flow  problem  is  simply  a specially  structured 
nonlinear  program  and  may  be  solved  with  any  of  a host  of  techniques. 
However,  due  to  the  underlying  network  structure,  specializations  of  these 
approaches  have  been  developed.  This  exposition  presents  a summary  of  the 
best  known  applications  of  convex  cost  network  flow  problems  as  well  as  a 
unification  of  the  algorithmic  approaches  available*  for  these  problems. 
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Our  objectives  have  been  to  (i)  help  researchers  place  their  work  in  the  proper 
context  with  the  existing  literature  and  (ii)  to  help  practitioners  in  algorithm 
; selection  by  presenting  the  algorithms  available  in  a uniform  notation.  Further- 

1 more,  implementation  has  been  the  underlying  theme  which  has  guided  our  presentation 

j 

I of  the  algorithms. 
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II.  APPLICATIONS 


There  are  two  basic  types  of  problems  which  are  modelled  as  convex 
cost  network  flow  problems,  zquAJUJbfium  pAobZzmi  and  pAoduxitLon-dutxibutLon 
jViobZejM . Equilibrium  models  appear  in  studies  involving  urban  transporta- 
tion systems,  pipe  network  systems,  and  electrical  network  systems.  A 
summary  of  the  various  models  which  have  appeared  in  the  literature  follows. 
Convex  cost  network  flow  problems  may  also  arise  as  subproblems  when  using 
penalty  or  barrier  techniques  on  nonlinear  programs  having  network  constraints 
as  a proper  subset  of  the  constraint  set  [e.g.  see  3,  11,  33,  41,  49]. 
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2.1  Equilibrium  Flow  Models 


The  equilibrium  problem  most  widely  studied  by  operations  researchers 

involves  the  distribution  of  traffic  in  urban  transportation  networks.  For 

this  type  network  the  nodes  represent  zones  or  intersections  of  streets  in 

a metropolitan  area  while  the  arcs  represent  streets,  expressways,  tollways, 

and  so  forth.  The  supplies  and  demands  are  given  by  the  number  of  vehicles 

which  travel  between  zones  during  some  period  of  time  (for  example  the 

morning  rush  hour) . This  model  differs  from  other  models  presented  in  this 

exposition  in  that  it  has  a multicommodity  nature.  The  commodities  may  be 

viewed  as  either  the  flows  between  individual  origin-destination  pairs  or 

the  flows  from  each  origin  to  all  its  destinations.  To  minimize  the  number 

of  variables,  the  latter  approach  is  adopted  here.  Let  the  sources  (comm- 

Ic 

odities)  be  indexed  by  l,...,p  and  let  x^  denote  the  flow  of  commodity  k 
on  arc  j.  Lee  the  function  fj(z)  denote  the  travel  time  for  each  vehicle 
on  arc  j when  the  total  number  of  vehicles  using  arc  j is  z. 

There  are  two  basic  models  which  have  appeared  in  the  transportation 
literature  corresponding  to  Wardrop's  [44]  two  principles  of  traffic  flow. 
The  first  principle  Is  stated  as  follows: 


The  journey  times  for  all  vehicles  with  the  same  orgin 
and  destination  are  equal  and  no  greater  than  the  journey 
time  which  would  be  experienced  by  a single  vehicle  on 
any  unused  route. 

This  is  called  the  p^ncA-pZe.  0^  zqu/lt  tXOLVZJt  and  is  reflected  in  the 

following  model: 


rain 


V k 


f^(t)dt 


(4) 


0 
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s.t. 

k k 

Ax  = r , 

(all  k) 

(5) 

^ 0 , 

(all  k) 

(6) 

Ic  Ic 

where  x is  the  vector  of  flows  for  commodity  k and  r is  the  vector  of 

requirements  for  commodity  k.  It  is  assximed  that  f ^ (t)  for  each  j is  cont- 
inuous and  monotone  increasing  over  0.  The  continuity  of  f^Ct)  guaran- 
tees the  existence  of  the  integral  while  the  assumption  of  a monotone  in- 
creasing fxmctlon  guarantees  convexity  (see  Roberts  and  Varberg  [37]).  The 
model  (4)  - (6)  is  generally  called  the  iLquJJUb-%Lim  <U4^gnme>tC  pfioblzm 

and  its  solution  is  often  referred  to  as  a user  optimized  flow  pattern. 

Chames  and  Cooper  [8]  in  1961  were  among  the  first  to  present  (4)  - (6) 
as  a model  of  Wardrop's  first  principle.  Since  then  a series  of  studies 
directed  Coward  developing  efficient  specialized  algorithms  for  the 
equilibrium  traffic  assignment  problem  have  appeared  in  the  literature 
[17,  18,  24,  26,  30,  35,  36].  In  addition  Dafermos  [13,  14]  have  extended 
this  model  for  two-way  streets  and  multiple  classes  of  users. 

Wardrop's  second  principle  called  the  pA^ncipte.  0^  ovzAoZZ  rnAKoriizciCLon 
may  be  stated  as  follows: 

fiom  nAz  diit’Ubutzd  ovsA.  tkz  oAci,  thz  neJMOAk  -ot 
4uc/i  CL  manneA.  that  tkz  4um  oi  tvivzL  tanzi>  ioA  aJU.  tuzAi 
mbuMized. 


The  corresponding  model  is  simply 

Tn^  n 


h 'J’ 


s.t.  Ax 


“ r 

> 0 


(all  k) 
(all  k). 


The  solution  to  this  model  is  often  referred  to  as  a 6MtZJn  cptinczzd 
pattern.  Jorgensen  [26]  was  among  the  first  to  investigate  this  model.  Other 
equilibrium  models  related  to  traffic  assignment  may  be  found  in  [1,  18,  43]. 
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The  problem  of  finding  steady-state  flows  and  pressures  in  a pipe 
network  has  been  of  interest  to  civil  engineers  and  city  planners  for  more 
than  forty  years.  In  1936  Hardy  Cross  [12]  developed  the  first  algorithm 
for  solving  such  problems.  His  procedure  is  simply  an  iterative  method 
for  solving  a system  of  nonlinear  equations  which  describe  the  equilibrium 
behavior  of  pipe  networks.  Recently  Hall  [20]  and  Collins  et.  al.  [9] 
discovered  that  this  problem  could  be  modelled  as  a convex  cost  network 
flow  problem.  For  this  type  network  the  nodes  represent  reservoirs  and  pipe 
intersections  while  the  arcs  represent  pumps,  pipes,  and  valves  of  various 
types.  The  model  is  simply 


min  Zj  ^ f^  (t)  dt 


s.t.  Ax  = r (8) 

X > 0,  (9) 

where  for  pipes,  pumps,  and  valves,  f j (t)  denotes  the  head  loss  (pressure 
drop)  in  arc  j as  a function  of  flow  in  arc  j and  for  artificial  pipes 
connecting  the  ground  to  a reservoir,  f ^ (t)  is  a constant.  The  decision 
variable  Xj  is  the  flow  in  arc  j in  units  of  volume  per  unit  time.  As 
before  f j (t)  is  assumed  to  be  continuous  and  monotone  increasing. 

The  problem  of  finding  the  steady-state  currents  and  voltages  for  a 
nonlinear  resistive  electrical  network  may  also  be  modelled  as  a convex  cost 
network  flow  problem  [7,  IQ,  16,  23].  For  this  type  network,  nodes  represent 
points  of  connection  while  arcs  represent  the  various  electrical  components. 
The  model  is  Identical  to  (7)  - (9)  where  for  all  components  except  voltage 
sources,  ^ j denotes  the  voltage  drop  across  the  element  represented  by 

arc  j as  a function  of  current  flow  in  arc  j . For  voltage  sources  which 
are  connected  to  the  ground  by  arc  j,  ^ j ^ constant.  The  decision 


variable  denotes  the  current  in  arc  j.  Each  (t)  is  assumed  to  be  both 
continuous  and  monotone  increasing. 

We  note  that  both  the  pipe  network  model  and  the  electrical  network 
model  each  have  special  structure  which  can  be  exploited  in  developing 
solution  techniques.  Even  though  the  particular  cost  functions  used  differ, 
both  are  separable  cind  convex.  Hence,  we  conjecture  that  an  algorithm  which 
proves  to  be  effective  for  one  of  these  models  should  also  work  well  on 
the  other  model. 


L 


2.2  Production-Distribution  Models 


Distribution  problems  in  which  demands  are  given  by  random  variables 
rather  than  by  known  constants  may  be  modelled  as  convex  cost  network  flow 
problems.  Let  dC  V denote  the  destinations  whose  demands  are  random 
variables.  Then  the  requirements  for  k £ D are  random  variables.  Let 
for  all  k £ D denote  the  density  functions  corresponding  to  these 
random  variables.  Further,  let  h^^  denote  the  unit  holding  cost  and  s^  denote 
the  unit  shortage  cost  at  destination  k.  Then  the  objective  is  to  choose 
amounts  to  be  shipped  in  order  to  minimize  the  shipping  cost  plus  the  expected 
holding  and  shortage  costs.  Mathematically  this  problem  may  be  stated  as 
follows : 


■ ' r. 


min  Z.-  c.x.  + 
jeE  j j 


'k£D  \ 


(|rj 


- t)fj^(t)dt  + 


^k£D  ®k 


Itkl)  fj^(t)dt 


s . t . Ax  = r 

0 ^ x ^ u. 

Note  that  for  the  demand  nodes,  the  requirements  are  variables  rather  than 
constants.  That  is,  | r^^  | for  all  k £ D represents  the  total  amount  shipped 
into  node  k to  meet  its  unknown  demand.  Problems  of  this  class  were  first 
suggested  by  Dantzig  [15].  Special  procedures  for  models  of  this  type  may 
be  found  in  (11,  32]. 

Distribution  problems  for  which  the  demands  are  known  constants  but  in 
which  some  of  the  supplies  at  the  production  facilities  are  unknown  may  also 
be  modelled  as  convex  cost  network  flow  problems.  The  basic  underlying  idea 
is  that  production  facilities  can  increase  their  capacities  by  any  of  a series 
increasingly  expensive  additions  of  equipment  and  services.  For  this  type 
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model  some  of  Che  requirements,  r,  , are  decision  variables.  Let  f.  (r  ) 

K tc  k 

denote  the  total  production  cost  at  facility  k for  a production  level  of 
of  Then,  we  obtain  the  following  model: 


s.  t.  Ax  ■ r 
0 < X ^ u. 

Algorithms  for  models  of  this  type  have  been  developed  by  Sharp  et.  al.  [40], 
and  LeBlanc  and  Cooper  [31]. 

A convex  cost  network  flow  model  has  recently  been  proposed  by  Rosenthal 
[39]  to  study  multireservoir  water  release  scheduling  for  the  Tennessee 
Valley  Authority.  The  objective  is  to  maximize  the  benefit  of  hydroelectric 
power.  This  benefit  function  reflects  the  savings  of  thermal  fuels  that  result 
from  using  hydroelectric  power.  For  this  model  the  nodes  represent  reservoirs 
at  a given  time  period  to  a different  reservoir  at  a later  time  period  or  (ii) 
a reservoir  connected  to  itself  at  a later  time  period.  One  type  of  arc  allows 
for  the  release  of  water  while  the  other  allows  water  to  be  stored. 


III.  iMGORITHMS 


There  are  basically  two  types  of  algorithms  which  have  been  proposed 
for  solving  the  convex  cost  network  flow  problem,  app-Xo XAjnatio n mdtliods  and 
dLxzctLon  mzthod-i.  Approximation  methods  use  a piece-wise  linear 
approximation  for  g(x)  and  solve  the  resulting  linear  network  flow  problem. 
Feasible  direction  methods,  however,  work  directly  with  the  nonlinear  cost 
function  g(x).  Let  y denote  a feasible  point  for  (1)  - (3).  We  define  a 
dLxzctton  to  be  any  nonzero  vector,  A direction  d is  said  to  be  a 
cLcxzctLon  at  y if  y + \d  is  feasible  for  some  X > 0.  A direction  d is  said 
to  be  an  -onpAou-oig  cLLxzction  if  d is  a feasible  direction  and  the 

directional  derivative  of  g in  the  direction  d at  the  point  y is  negative. 
Given  the  above  definitions,  the  feasible  direction  algorithm  may  be  described 
as  follows : 

ALG-1;  FEASIBLE  VUECTION  ALGORITHM 

0.  hiLtoztLzitCion 

Let  y^  be  any  feasible  point,  let  £ > 0 be  a termination  tolerance, 
and  set  k 0. 

1.  FTjid  An  ImpAov-cng  FzcLi-tblz  VLxzctLon 

Let  d be  an  improving  feasible  direction  at  y,  . 

k 

2.  Lbiz  SzcLXch 

it 

Find  a such  that 

•k 

g(y  + ot  d)  = min  {g(y  + ad):  A(y,  + ad)  = r,  0 < v,  + ad  < u . 

Vl  ^k  ^ ^ ^ If  < e terminate; 

otherwise,  go  to  1. 
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The  major  difficulty  with  the  above  algorithm  is  the  determination  of 
an  appropriate  direction  finding  program  for  step  1.  The  various  methods 
differ  in  the  means  for  finding  d.  However,  all  the  methods  presented  in 


this  exposition  make  use  of  the  directional  derivative.  Recall  that  the 
directional  derivative  of  g(x)  at  y in  the  direction  d with  Euclidean  norm 
I d I is  given  by 


D^jg(7)  - 


lim 
t ^ 0 


t 


providing  the  limit  exists  [28].  A more  computationally  useful  form  of  the 
directional  derivative  is  given  by 


D^g(y) 


7g(y)d 


where  ^g(y)  is  the  gradient  ef  g evaluated  at  y. 


Since  D^g(y)  < 0 whenever  Vg(y)d  < 0,  Vg(y)d  is  the  expression  used  in  all 
feasible  direction  methods  for  the  selection  of  an  improving  direction. 

The  particular  t3rpe  of  line  search  used  in  step  2 usually  depends 
on  whether  or  not  g is  differentiable.  If  g is  differentiable  and  the  deriva- 
tives are  easily  calculated,  then  a b^zctlon  JtZcuxcJi  [3,  33,  49]  is  usually 
used;  otherwise,  either  a goldzn  -izcLtLon  or  F-ibonaccl  itZaxcSi  [3,  33,  41,  49] 
is  used. 
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3.1  Piece-Wise  Linear  Approximation 


In  order  to  simplify  the  presentation  we  assume  that  g(x)  is  separable, 
i.e.  g(x)  = g^(Xj).  Suppose  K.  linear  segments  are  to  be  used  in  the 

approximation.  Then  the  interval  [0,  u^ ] is  partitioned  into  K segments 

k k 

each  of  length  u for  k = 1,...,  K,  such  that  u = Z u..  Let  v?  denote  the 
J J ^ 3 ^ J 

• O k“T)  k k 

right  end  point  for  the  pth  segment.  That  is,  vi  = Z u..  Let  x denote  the 

J K-1  j j 

flow  in  segment  k of  arc  j with  unit  cost  of  Then  x = E x*^ 

y i k j 

Substituting  x.  = E,  x into  (1)  - (3)  vields 

J k J 

v-  k k 

min  , c,  X. 

j .k  j J 

St  Z Ax^  = r 

k 

0 ^ x^  ^ u^  , (all  k) 

where  x^  is  a vector  of  all  flows  for  the  k*"^  segment  and  u^  is  a vector  of 
bounds  for  the  k^^  segment.  There  are  numerous  techniques  [7,  9,  10]  which 


may  be  applied  to  determine  the  unit  costs,  c^  . An  extension  of  the  idea 

of  least  squares  has  been  used  in  [9,  10].  To  apply  this  approach  we  define 

tk/  \ k,  k..  k*“  1 \ 

h (x.)  = p + C.  • (x.  - V.  ) 

J 3 3 J 3 J 

k-l  k 0 k 

for  the  interval  v where  v.  = 0.  Then  c.  is  selected  as  the 

3 - J - J J J 

value  which  minimizes 


L 


( k 
"3 


k-l 

V , 


[gj(t)  - hj(t)]^  dt. 


3 1 

For  the  first  segment,  p = g (0)  and  for  all  other  segments  p.  = h.  (v  , 

3 3 J J j 

The  formula  which  yields  this  minimization  may  be  found  in  [10]. 
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3.2  Frank-Wolfe  MeChod 


In  1956  Frank  and  Wolfe  [19]  proposed  a nethod  for  solving 
nonlinear  programs  having  a convex  differentiable  cost  function  and  linear 
constraints.  Given  a feasible  point  at  iteration  k for  (1)  - (3),  say 
one  finds  another  feasible  point,  say  by  solving  the  linear  network 
flow  problem  min  {Vg(yj^)x  : Ax=*r,  O^x^u}.  Asa  by-prodoct  of 
the  solution  of  the  linear  network  problem,  one  gets  a lower  bound. 

Consider  the  following  preposition: 

Proposition  1. 

Let  X denote  an  optimum  for  (1)  - (3)  and  let  y be  any  feasible  point 
for  (1)  - (3).  Let  2 solve  the  linear  program  min  {Vg(y)x  : Ax  - r,  0 ^ x £ 
Then  g(x)  > g(y)  + 7g(y)*(2  - y). 

Proof.  By  Theorem  4 page  72  of  [29]  we  have  g(x)  ^ g(y)  + Vg(y)»(x  - y) . 
Then  g(x)  > g(y)  + Vg(y)x  - Vg(y)y.  But  Vg(y)x  > Vg(y)z.  Therefore, 

g(x)  > g(y)  + Vg(y)  z - Vg(y)y  =.  g(y)  + Vg(y).(z  - y)  . 

Using  the  lower  bound  provided  by  the  above  proposition,  the  Frank-Wolfe 
algorithm  may  be  stated  as  follows: 

A LG- 2:  FRANK-WOLFE  ALGORITHM 

0 . I nitiatizatio  n 

Let  y^  be  any  feasible  solution.  Set  k 0,  initialize  the  lower  bound 
S and  choose  the  termination  parameter  £ > 0. 

1.  Solve.  SleJlvoAJz.  Suhpn.obtm,  Update  Bound,  Check  Foa  TevninatLon 

Let  z denote  the  solution  to  min  {7g(yj^)x  : Ax  * r,  0 ^ x ^ u}. 

Set  S ■»-  max  {0,  g(yj^)  + ^g(yj^)*(z  - y^^)  ^ • If  ®^^k^  - 0 ’^  £, 
terminate  with  y^^  as  an  £-optimun;  otherwise,  k k +1. 


Luie  Search 


Let  be  the  point  on  the  line  segment  between  and  z having 

smallest  objective  function  value  and  go  to  1. 

It  is  well-known  that  the  convergence  rate  of  this  procedure  slows  sub- 
stantially as  the  optimum  is  approached  [9,  17,  18,  32,  35].  Several  modifi- 
cations of  the  basic  algorithm  have  been  proposed  to  enhance  the  computational 
effectiveness  of  this  procedure.  These  are  designed  to  avoid  the  zigzag 
character  frequently  exhibited  by  this  procedure.  Figure  1 illustrates 
this  phenomenon. 


Figure  1 About  Here 


The  method  of  parallel  tangents  (PARTAN)  [33]  integrates  movements  in 
the  direction  9)  along  with  the  basic  Frank-Wolfe  steps.  The  algorithm 

is  given  below. 

ALG-3:  FRANK-WOLFE  ALGORITHM  WITH  PARTAN 

0 . I nitLaZizatLc  ki 

(same  as  ALG-2) 

1.  Sotvt  NzXuio^k  SabpfLoblzm,  Updatz  Bound,  Chzck  FoA.  TzA/ninaXion 

(same  as  ALG-2) 

2 . Linz  SzcLTch 

Let  w be  the  point  on  the  line  segment  between  y^^  ^ and  z having 
smallest  objective  function  value.  If  k = 1,  y^^  w,  and  go  to  1. 
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PAIWM  Stzp 


r 


J . 

Let  be  the  feasible  point  having  smallest  objective  function  value 
on  the  half  ray  from  yj^_2  through  w,  and  go  to  1. 

Figure  2 illustrates  the  Frank-Wolfe  method  using  PARIAN. 

Figure  2 About  Here 


In  1970  Wolfe  [47]  presented  a modification  of  the  Frank-Wolfe 
algorithm  which  incorporated  what  is  called  _rvU,'a^  4.tep4.  The  modified 
algorithm  is  described  below. 

ALG-3:  FRANK-WOLFE  ALGORITHM  WITH  AWA/  STEPS 

0.  InitioLizcutLon 

(same  as  ALG-2) 

1.  F-ind  ToMOAd  Stzp  VtAzction,  Updatz  Sound,  CkzcJz  Foa  Tz-vnauitLon 

(same  as  ALG-2  with  k k + 1 deleted) 

2.  Awat/  Stzp  V-oizctlon 

Let  w denote  the  solution  to 

max  {Vg(yj^)x  : Ax-r,  0£x£u,  Xj  “Oif  (yj^)j  “0} 

Set  y,  - w. 

k 

3.  F-ind  Mux  Movzmznt  In  Aimy  V-iAZction 

Let  denote  the  solution  to 

max  {a  : A(  w + ctd^)  ” r,  0 £ w + ad^  £ u} . 
a 0 

If  £ 1 , go  to  5 . 

4.  Szlzct  V-Oizctian 

If  l'7g(yj^)2|  > |Vg(yj^)wi,  go  to  5;  otherwise,  go  to  6. 
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5.  Line.  StiVidi  W^tk  Toward  P-ctecti.cn 


Let  be  the  point  on  the  line  segment  between  and  z having 

smallest  objective  value.  Set  k k + 1 and  go  to  1. 

6.  Line  SzoAck  W-ct^i  Auiau  P-ttectcon 

* 

Let  a denote  the  solution  to 

min  Set  k -»•  k + 1 , yj^ + a*d*  , 

1 _<  a £ 

and  go  to  1, 

The  Frank  -Wolfe  method  using  away  steps  is  illustrated  in  Figure  3. 


Figure  3 .\bout  Here 


Holloway  [22]  proposed  a different  extension  of  the  Frank-Wolfe 

method.  This  extension  requires  that  y^  and  each  z produced  in  step 

1 of  ALG-2  be  saved.  Suppose  that  after  k passes  through  step  1 

that  these  solutions  are  denoted  z Then  Holloway  suggests 

0 k— 1 

that  the  line  search  of  step  2 ALG-2  be  replaced  by  the  following 
problem, 


min  giygA  + z.a.) 


s.t. 


A, a.  > 0,  (all  i). 


1 — 


(10) 


* * 


Letting  A ,a^  for  i = 0,...,k-l,  denote  the  solution  to  (10), 

* - * 
then  y A y + I.  z.ot.. 

•'  0 111 
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The  computacional  experimentation  as  represented  by  [17,  22,  32] 
using  both  the  away  steps  and  Holloway’s  extension  on  different  classes 


of  problems  is  inconclusive.  These  extensions  have  been  of  benefit  on 
some  classes  of  problems  but  have  slowed  computational  times  for 

others. 

I 

i 


3.3  Zoutendijk's  Method  of  Feasible  Directions 


In  this  section  we  present  a specialization  of  the  work  of  Zoutendijk  [50] 
to  the  convex  cost  network  flow  problem,  (1)  - (3).  We  assume  that  g(x)  is 
differentiable  over  {x  : 0 _<  x _<  u} . Given  a feasible  point  y,  any  feasible 
direction  d must  satisfy  A(y  + ad)  = r for  some  a > 0.  Clearly,  for  (1)  - (3) 
any  feasible  direction  d must  have  Ad  = 0.  Furthermore, 


d.  > 

1 - 

0. 

for 

all 

j 

such 

that 

yj  -0 

d.  < 
j - 

0, 

for 

all 

j 

such 

that 

y.  = u. 
J 3 

Then  an  appropriate  direction  finding  program  is  the  following: 


min  7g(y)d 
s . t . Ad  = 0 


d. 

3 

> 

0, 

for 

all 

j 

such 

that 

yj  = 0 

d. 

3 

< 

0. 

for 

all 

j 

such 

that 

y.  - u 

d| 

1. 

(11) 

(12) 

(13) 

(14) 

(15) 


Due  to  the  nonlinear  constraint  (15),  (11)  - (15)  is  difficult  to  solve. 
Hence  a relaxation  of  (11)  - (22)  where  (15)  is  replaced  by 
-1,  < d < 

is  usually  solved.  The  relaxed  problem  is  simply  a linear  network  flow 
problem.  If  an  improving  feasible  direction  for  (11)  - (15)  exists,  then 
an  optimum  solution  to  the  relaxed  problem  will  be  an  improving  feasible 
direction  although  not  necessarily  the  best  local  direction. 
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Beale  [4]  was  the  first  to  apply  a feasible  direction  algorithm  (similar 
to  that  described  by  Zoutendijk)  to  the  convex  cost  network  flow  problem.  How- 
ever, this  work  was  designed  for  a bipartite  structure.  Furthermore,  the  partial 

derivatives  are  approximated  and  the  flows  are  always  changed  by  a single 
unit.  This  approach  does  not  require  that  g(x)  be  differentiable,  it  avoids 
the  need  for  a line  search,  and  it  is  applicable  to  integer  as  well  as 
continuous  problems.  Extensions  of  this  algorithm  to  an  arbitrary  network 
have  been  given  by  Hu  [23]  and  Klein  [27].  Other  similar  approaches, may  be 
found  in  [34]  and  [45]. 


3.4  Rosen's  Gradient  Projection  Method 


The  gradient  projection  method  [38]  is  motivated  by  a desire  to  implement 
the  feasible  direction  philosophy  while  not  requiring  the  solution  of  a linear 
program  at  each  iteration.  The  strategy  employed  is  to  move,  if  possible,  in 
an  improving  feasible  direction,  derived  from  the  negative  gradient,  so  that  all 
active  (currently  binding)  constraints  remain  active.  The  negative  gradient 
is  projected  onto  a subspace  of  the  feasible  region  which  insures  that  all 
active  constraints  remain  active.  If  the  projection  is  a non-zero  vector,  it 
will  lie  along  an  improving  feasible  direction.  If  the  projection  is  the  zero 
vector,  then  either  optimality  can  be  verified  or  one  attempts  to  find  an 
improving  feasible  direction  for  which  one  or  more  binding  constraints  become 
non-binding.  The  procedure  guarantees  that  either  an  improving  feasible  direc- 
tion can  be  found  or  optimality  can  be  verified.  For  the  convex  cost  network 
flow  problem,  the  equality  constraints  (2)  are  always  active;  whereas,  the  in- 
equality constraints  (3)  may  be  either  active  or  inactive  at  any  given  step. 

Given  a feasible  point  y,  the  subspace  tangent  to  the  active  constraints 
is  given  by  the  set  M = {d  : Ad  = 0,  d^  = 0 if  y^  = 0 or  y^  = u^ } . It  is 
well-known  that  one  row  of  A is  redundant  [2],  and  that  A has  rank  equal  to 
the  number  of  rows  less  one.  Let  A denote  the  A matrix  with  one  row  deleted. 

Let  r = {j  : yj  = 0 or  y^  = Uj } and  let  k denote  the  order  of  F.  Let 
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where  j.  £ F for  i = and  e.  is  a column  vector  with  a one  in  positron 

^ ■^i  _ 

i.  and  zeroes  elsewhere.  We  assume  that  A has  full  row  rank,  and  we  let  M = 

1 

— ?1 

{d;  Ad“0}  . A subspace  orthogonal  to  M is  defined  by  N = {h  = A^’a  : a e E ; 

where  p is  the  row  dimension  of  A (i.e.  the  number  of  nodes  + k minus  1). 

The  following  proposition  proves  that  M and  N are  orthogonal. 

Proposition  2. 

M and  N are  orthogonal. 

Proof.  Let  d e M and  let  h e N.  Then  h can  be  represented  as  h * A'*a  for 
some  a E . Then  d^h  » d^A^a  = (Ad) "a  » 0. 

Then  any  vector  in  E^,  can  be  represented  by  d + h where  d e M and 
h e N.  In  particular,  the  negative  of  the  gradient  can  be  represented  this 
way.  That  is  -Vg(y)  “ d + A^a  for  some  a e E^.  But  Ad  = -A7g(y)  - A A'*a  = 0 
implies  a - -(A  A^)"^  A7g(y).  Since  A has  full  row  rank  the  existence  of 

(A  a')  ^ is  gtiaranteed  [5]. 

Therefore  d • -Vg(y)  + A'(A  A^)  ^ A Vg(y) 

- -(I  - A'(A  A')“^  A ]Vg(y). 

Letting  P = ( I - A'’  (A  A"* ) ^ A ] we  have 

d - - PVg(y) , 

where  P is  called  the  projection  matrix.  That  is,  Px  for  any  vector  x 
is  the  projection  of  x on  the  subspace  M.  The  following  proposition  proves 
that  d ■ - P7g(y)  is  an  improving  feasible  direction,  when  d ^ 0. 

Proposition  3. 

Vg(y)d  < 0,  where  d = - PVg(y)  f 0. 

Proof.  7g(y)d  - [Vg(y)  + d - d]d  » [Vg(y)  + d]d  - dd.  But  Vg(y)  + d is 
orthogonal  to  d.  Hence  Vg(y)d  » -dd  < 0 for  d 0. 

If  -P7g(y)  - 0,  then  either  optimality  may  be  verified,  or  one  seeks 
an  improving  feasible  direction  by  allowing  movement  of  the  variables  corres- 
ponding to  active  constraints  from  (3) . Verification  of  optimality  requires 
straight-forward  Kuhn-Tucker  analysis,  the  details  of  which  may  be  found  in 
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[33].  When  -PVg(y)  = 0 and  a variable  for  which  movement  is  to  be  allowed 
is  identified,  the  corresponding  row  of  (3)  is  removed  from  A.  This  gives 
rise  to  a new  projection  matrix  P and  a newly  projected  vector  -PVg(y).  This 
process  continues  until  either  an  improving  feasible  direction  is  obtained 
or  optimality  is  verified. 

A major  part  of  the  computational  burden  of  the  gradient  projection  method 
involves  recomputing  the  projection  matrix  when  A changes.  However,  A k'  is 
a symmetric  matrix  and  specialized  inversion  techniques  and  data  structures 
may  be  used  to  maintain  (A  k')  Furthermore,  all  changes  in  A k'  may  be 
accomplished  by  adding  or  deleting  a single  row  from  A.  Hence  changes  in 
(A  a')  ^ may  be  obtained  by  a simple  updating  scheme  which  may  be  applied 
several  times  each  iteration.  These  updating  procedures  are  specializations 
of  the  following  proposition. 

Proposition  4. 

Consider  the  partitioned  matrices 


where  B = A and  A^  ^ and  ^ are  assumed  to  exist.  Then 
“ A^"^  + Aj^"^  A^  A3 

A^  Q"^, 

®3  * ^3 

B4  " 


where  Q =•  (A^  - A3A3  ^A2) ; and 

A - B,  - B,B,“^B.. 
1 1 2 4 3 


A proof  of  Proposition  4 may  be  found  in  [6]. 
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Then  by  Proposition  4, 


(A  A')"^+  q(A  A")"^  Ae^e^  A"  (A  A0“^  [ -q  (A  A')"^  Ae 

I 

1 

I 

-qe'rdl")"^  I q 


where  q ■ 

1 - e'  a'  (a  A')“^  Ae 

3 3 

Suppose  that  the  last  row,  say  e^,  is  to  be  removed  from  A.  That  is 


Let  [a'  j q]  denote  the  last  row  of  (A  A')  Then 


(A  A') 


Then  by  Proposition  4, 
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is  transforaed 


Of  course,  if  the  row  to  be  deleted  is  not  the  last,  (A  A^)  ^ 
by  interchanging  both  the  last  row  with  the  one  to  be  deleted  and  the  last 
colum  with  the  one  to  be  deleted,  before  applying  the  updating  formula. 

The  general  algorithm  proceeds  as  follows : 

ALG-4:  GRAVKNT  PTiOJECTlON  ALGORITHM 

0 . I uLtiaZiza-tion 

Let  y^  be  any  feasible  solution  and  set  k 0.  Develop  A and  (A  A")~^. 
This  inverse  may  be  developed  by  beginning  with.  (S  A.')  ^ and  applying  (16) 
successively. 

1.  CalcLLlatz  Pxojzctlon 

Set  d ■*-  -(I  - k'  (A  k')  ^ A)Vg(yj^). 

2 . LiilZ  SzcLick 

■k 

If  d = 0,  go  to  4,  otherwise,  let  a denote  the  optimum  of 

min  {g(y,  + ctd)  : 0 _<  y + ad  _<  u}. 
a > 0 ^ ^ 

>'k+l  >’k  ^ - 5.  + 1 

3,  C'mck  Eluding  Conitxalnti 

For  each  constraint  from  (3)  which  is  binding  and  is  not  accounted  for 

in  A,  append  a row  to  A and  update  (A  k')  ^ using  (16)-  Go  to  1. 

4.  C/iecfe  ^ox  To.'uninadion 

Set  6 ■*-  - (k  k')  ^ A Vg(y).  Let  Jq  {j  : 6^  > 0 and  row  j of  A corres- 
ponds to  a variable  at  zero}.  Let  J (i  : 5.  < 0 and  tow  i of  A 

u j 

corresponds  to  a variable  at  upper  bound}.  If  i_'  J = $,  terminate 
with  y optimal;  otherwise,  select  any  k e J , delete  row  k from 

A,  update  (A  A ) using  (17),  and  go  to  1. 

The  major  disadvantage  of  the  above  procedure  for  convex  cost  network  flow 
problems  is  that  the  network  structure  cannot  he  easily  exploited. 
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However,  A X'  can  easily  be  derived  from  the  network  structure.  The 
element  in  the  ith  diagonal  position  is  the  number  of  arcs  incident 
with  the  node  represented  by  row  i of  A and  for  i ^ j,  the  (i,j)th  element 
is  the  negative  of  the  number  of  arcs  connecting  the  nodes  represented  by 
rows  i and  j . However  we  know  of  no  way  to  develop  (A  X')  ^ using  only 
graphical  operations.  Also  note  that  Ae^  will  always  consist  of  the 
sth  column  of  A with  additional  zeroes  appended  at  the  bottom. 
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3.5  Simplex  Based  Methods 


There  are  two  simplex  type  methods  (the  convex  simplex  method  of 
Zangwill  [48]  and  the  reduced  gradient  method  of  Wolfe  [46])  which  may  be  used 
to  solve  convex  cost  network  flow  problems.  Both  techniques  mav  be  viewed 
as  generalizations  of  the  linear  simplex  method.  They  adhere  to  the  simplex 
strategy  of  partitioning  the  variables  into  basic  and  nonbasic  sets,  but 
they  differ  from  the  linear  simplex  method  in  that  the  nonbasic  variables 
are  allowed  to  assume  values  other  than  their  upper  and  lower  bounds.  As 
with  the  linear  simplex  method,  the  convex  simplex  method  allows  a single 
nonbasic  to  change  at  each  iteration  while  any  number  of  nonbasics  may  change 
with  the  reduced  gradient  method. 

Suppose  we  partition  Ax  = r into  [B  ■ N]x  = r where  B is  a basis.  Like- 

wise,  partition  x into  [x  I x ] and  u into  [u  | u'^] . Then  (1)  - (3)  may 

be  stated  as  follows: 

■ 4 /r  B . N,, 

min  g([x  ; X ]) 

B -1  -1  M 

s.t.  X =B  r-B  Nx‘ 

B B 
0 £ X £ u 

N N 
0 < X < u . 


g 

After  substituting  for  x we  obtain 


min 

f(x”) 

s.t. 

0 £ B“^r  - b”^Nx 

-N  ^ N 

0 ^ X < u , 

where  f(x  ) = g((B  r-B  ^Nx^  I x^l).  By  making  an  analogous  partitioning  of 
Vg,  i.e.  Vg(x)  = [7g  (x)  I Vg^(x) ] , we  obtain 

7f(x^)  = 7g^’(x)  - 7g®(x)B“Hl.  (18) 
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Hence,  'vf(x^)d/|d(  is  Che  directional  derivative  in  nonbasic  space.  For 
the  convex  simplex  method,  d is  always  a vector  with  one  nonzero  entry  which 
is  either  1 or  -1.  The  components  of  (18)  play  the  role  of  the  reduced  costs 
in  the  linear  simplex  method.  Given  the  direction  of  change  (i.e.  the  non- 
basic variable  to  change)  a one  dimensional  search  is  required  to  determine 
the  magnitude  of  the  change.  An  actual  simplex  pivot  is  performed  only  if 
Che  resulting  change  forces  one  of  the  basic  variables  to  zero  or  upper  bound. 
An  illustration  of  the  convex  simplex  method  is  given  in  Figure  4. 


Figure  4 About  Here 


Recall  that  the  graph  associated  vrith  (1)  - (3)  is  defined  by  Che  node  set 
V and  the  arc  set  E.  Following  the  notation  of  Johnson  [25]  a 6-urtp.tz  pdCli 
in  [V,E]  is  a sequence  of  alternating  nodes  and  connecting  edges  such  that  no 
node  (or  arc)  is  repeated.  The  direction  of  the  arcs  is  unimportant  in  de- 
fining a simple  path  and  both  i,  (i,j),  ( j ,k) , k and  i,  (j,i),  j,  (j ,k) , k 
are  simple  paths  connecting  nodes  i and  k.  A ciJcZz  is  a simple  path  together 
with  an  arc  joining  the  beginning  node  and  ending  node  of  the  path.  A 
connzctzd  gAjzph  is  a collection  of  nodes  and  arcs  having  at  least  one  simple 
path  between  every  pair  of  nodes,  and  a tree  is  connected  graph  with  no  cycles. 

Since  A has  rank  N - 1,  we  add  an  artificial  variable  to  some  row  (node), 
say  row  I,  to  obtain  the  followina  system. 

Act  + e^a  - r,  (19) 

where  e^  is  a vector  with  a 1 in  Che  position  and  zeroes  elsewhere.  The 
variable  d does  not  appear  in  the  objective  function  (1)  and  the  upper  bound 


-27- 


for  a is  set  to  zero.  Let  B be  any  basis  of  (19).  Clearl 


y a appears  in  every 


such  basis.  Then  the  subgraph  generated  by  B,  called  , consists  of  all 
vertices  of  V and  arcs  corresponding  to  columns  of  (19)  in  B.  The  artificial 
variable  may  be  thought  of  as  an  arc  incident  to  a single  node,  i.e.,  node  £. 
It  is  well-known  that  is  a tree,  if  and  only  if  B is  a basis  [2,  25]. 

Node  t is  called  the  root  of  the  tree. 

Let  denote  the  flow  in  arc  j . Using  this  terminology  the  convex 
simplex  method  is  now  presented. 


ALG-5:  CCWEK  SJy,PLEX  ALGORITHM 


1 


0. 


1. 


I nctltttczcLtion 

Let  y be  any  feasible  point  and  partition  A into  [B  ; N]. 

P-TLcMig 

Set  c^  Vg^(y)  - 7g^(y)B 


N N 

Any  non-basic  arc  j having  c > 0 and  y > 0 or  c,  < 0 and  y < u is  a 

J J J J J 

candidate  for  flow  change.  If  no  such  arc  exists,  terminate;  otherwise,  let 
<j)  denote  the  entering  arc. 


2.  Ratio  Te^t 

Suppose  the  entering  arc  has  its  tail  at  u and  its  head  at  v.  The  ratio 
test  requires  that  one  determine  the  orientation  of  the  arcs  on  the  simple 
path  from  u to  v in  F . Let  r"^  denote  the  set  of  arcs  in  this  path  from 

O 

u to  V that  are  traversed  in  normal  direction  (i.e.,  tail  to  head)  and 
let  R denote  the  set  of  arcs  in  this  path  traversed  in  reverse  direction. 
Then  the  maximum  change  in  X^  is  given  by 
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r 


min  X , min  (u.  - X ),  u - X | , if  c,  < 0 


jeR 


■j  j'’  i 9 


min  X.,  min  (u  - X ) , X ] , if  c.  > 0 


min 


■j  y'  9 


V. 


jeR 


jeR 


If  V ■ 0,  go  to  5. 


3. 


Line  SejVLcJi 

Define  the  direction  vector  as  follows: 


^1,  if  arc  j is  in  r"*”. 


1,  if  arc  j is  in  R , 
1,  for  j “ 9, 

^0,  otherwise. 


Let  a denote  the  solution  to  the  following 


g(y  + ad»  , if  c,  < 0 


min  ( g(y  + ad»  , 

0 < a < A V J 

min  { g(y  + ad)>, 

-A  £ a £ 0 J 


g(y  + Od)),  if  c^  > 0. 


4 . Update.  FZom 

X.  K + a*d 

If  I a I < A,  return  to  step  1. 

‘^(fc  ^ ” ^9  ~ ^(i)’  *^9  ^ ® ^ ~ 

55  Pivot  Required? 

Some  basic  variable  is  either  at  0 or  upper  bound.  Perform  a pivot 
entering  in  place  of  one  of  these  basic  variables  and  go  to  1. 
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Specialization  of  the  pricing  and  line  search  operations  for  the  convex 
simplex  method  have  been,  developed  In  [21] . Computational  experience  using 
this  approach  may  be  found  in  [9,  36,  42]. 

Using  the  same  terminology  the  reduced  gradient  method  is  now  presented. 

ALG-6;  REVUCEV  GRADIENT  ALGORITHM 

0.  In^ytLoLizoitLon 

Let  y be  any  feasible  point  and  partition  A into  [3  j N].  Renumber  the 

variables  such  that  y = [y®  ! y^]  and  y®  = [v, ,...,y  ] and  v^=  [v  . y 1 

1 m - V..  , 


Choose  a termination  parameter  e > 0. 

Ciitc.ula.tz  Reduced  Gnxidtzyvt  and  Vttzc.tLon  In  Monbcatc  Spacz 
N , „ B 


Vg®(y)B  ^ - Vg^(y). 


r N 

.if0<yj<uj 


— 1 , . . . ,n 


N N 

c.  , if  c.  <0  and  y.  = u. 

J J 1 j 

c , if  c > 0 and  y.  = 0 

i j 3 

0,  otherwise 


If  d ..  < £,  j = l,...,n,  terminate. 

m+] 


2.  Catzuiatz  Vtxzdtion  In  Sa^tz  Space 

Set  dj  0,  j = l,...,m.  Let  arc  j be  denoted  by  (t^  , hj).  Let 

denote  the  set  of  arcs  in  the  path  from  t.  to  h.  in  F„  that  are  traversed 

J J o 

in  normal  direction  and  let  denote  the  remaining  arcs  in  this  path. 

a.  [Initialization]  k ■*-  m + 1. 

b.  [Cycle  Trace  Required?]  If  d,^  = 0,  go  to  d. 
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y-^y  + od.  If  a < return  to  1. 

6.  P-tvo-C 

Some  basic  variable  Is  at  zero  or  upper  bound.  Perform  a pivot  letting 
some  nonbaslc  variable  having  d , , ?*  0 enter  the  basis  in  place  of 
some  basic  variable  at  zero  or  upper  bound  and  go  to  1. 
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Computational  experience  with  both  the  convex  simplex  method  and 
reduced  gradient  method  may  be  found  in  (42).  In  addition,  mixed 
methods  combining  both  techniques  have  also  been  suggested  in  (42). 
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Figure  3. 


Illustration  of  Frank-Wolfe  Method  Using 
Away  Steps.  ( — — , toward  step  directon; 
— ,away  step  direction.) 
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